Abstract. —Four members of the alpha-tubulin gene family were examined in Ceratopteris 
richardii. Genetic linkage mapping based on a population of nearly 500 Doubled Haploid Lines was 
able to position three or four members of this gene family on linkage groups 17, 24, and 28, 
respectively (two of the four observed polymorphic restriction fragments containing alpha-tubulin 
genes are either identical or map too close to each other on linkage group 17 to be distinguishable 
in map distance). Non-mappable monomorphic bands observed on probed Southern blots suggest 
that the alpha-tubulin gene family in this species is large. Four alpha-tubulin genes from C. 
richardii were sequenced and found to be fairly similar to each other in terms of their amino acid 
sequences, with their greatest diversity at the carboxy-terminal ends. BLAST comparisons found 
each of these four amino acid sequences more similar to an alpha-tubulin from a dicot, 
gymnosperm, or alga species than it was to any other alpha-tubulin sequence presently known 
from Ceratopteris or from the fern Anemia phyllitidis or the moss Physcomitrella patens. Bayesian 
phylogenetic analysis of nucleotide sequences placed three of the four Ceratopteris alpha-tubulin 
gene copies in a clade with copies from Pseudotsuga and Anemia, consistent with a history of two 
gene duplication events, one following and one preceding the divergence of ferns and seed plants. 
The fourth copy is robustly separated from the preceding three and placed in a clade of algal 
alpha-tubulin genes, suggesting its divergence from the ancestor of the other three before the 
divergence of algae and land plants. As characterized thus far, the alpha-tubulin gene family of 
C. richardii is relatively large as compared to the six copies known from fully sequenced 
Arabidopsis thaliana, a condition that may be correlated with the large genome size and diverse 
life history constraints of this homosporous fern species. These findings suggest several new 
opportunities for research into the evolution, function, and regulation of the alpha-tubulin gene 
family in Ceratopteris. 


This report describes the application of DNA sequencing and genetic linkage 
mapping to the alpha-tubulin genes of Ceratopteris richardii and shows how 
such studies can further enhance the utility of this model system. Initial 
successes with Ceratopteris were in the areas of cytogenetics (e.g., Hickok, 
1976, 1977a, 1977b, 1978, 1979a, 1979b; Hickok and Klekowski, 1973, 1974), 
physiology and development, and Mendelian genetics (reviewed in Hickok, 
1987; Hickok et al., 1987, 1995). More recent studies have focused on the 
molecular genetics of Ceratopteris (Munster et al., 1997; Hasebe et al., 1998; 
Aso et al., 1999; Stout et al., 2003; Rutherford et al., 2004; Salmi et al., 2005). 
These recent studies that employ robust molecular methods are especially 
encouraging, since the lack of a technique to induce stable genetic 
transformation in Ceratopteris has undoubtedly impeded its function as a unifying 
model system. In this report, DNA sequences from four unique alpha-tubulin 
genes of C. richardii are described, and the phylogenetic relationships of these 
genes to other plant alpha-tubulin genes are inferred. Molecular linkage data 
for three polymorphic alpha-tubulin loci of C. richardii are used to position 
these loci on the new genetic linkage map of this species, and several potential 
strategies for using this new information in molecular studies are outlined. 
These perspectives should provide further encouragement for those seeking to 
utilize Ceratopteris as a model system. 

The alpha-tubulin gene family was selected for these studies because of the 
biological significance and tractability of tubulin in the gametophyte 
generation of ferns and because of the new insights that may be gained for 
genomics by studying them. Inferring evolutionary relationships among 
members of gene families is generally problematic because distinguishing 
paralogues and orthologues is difficult without appropriate known outgroups 
to root the gene phylogeny. Tubulin genes are ideal for studying gene 
family evolution because 1) alpha-, beta-, and gamma-tubulin genes are known 
to have diverged well before the divergence of plant lineages, and 2) the highly 
conserved nature of the genes allows their sequences to be aligned. These 
features allow inference of gene phylogenies within a given tubulin gene 
group, using sequences from other tubulin groups as outgroups. 

Tubulins and the microtubules they form are obviously essential components 
of all eukaryotic cells. In fern gametophytes, however, their roles are 
directly observable in various stages of development that can be easily studied. 
For example, one of the first events that signals preparation for spore 
germination is the migration of the nucleus within the cytoplasm (Banks, 
1999). This event, which is critical to the continued development of the 
gametophyte, is inhibited in Onoclea sensibilis by several microtubule 
inhibitors, including colchicine (Vogelmann et al., 1981). An important role 
for microtubules at a later stage of development has been suggested by the 
studies of Murata et al. (1997) on blue light-induced inhibition of cell growth 
in dark-grown Ceratopteris gametophytes. These investigators found that 
cortical microtubules reorient in response to blue light at the same time that 
inhibition of cell elongation occurs. Microtubules also serve a key role in the 
organization and function of fern sperm (Raghavan, 1989), and Ceratopteris 
sperm has been used extensively in studies characterizing these roles 
(Hoffman and Vaughn, 1995a, 1995b, 1996; Hoffman et al., 1994; Renzaglia 
et al., 2 



Because homosporous ferns compose most of the sister group to seed plants 
(including flowering plants; Pryer et al., 2001), increased knowledge of 
genome structure and organization in the genomes of homosporous ferns will 
significantly broaden our knowledge of genome structure and evolution in 
vascular plants. Furthermore, characterization of the alpha-tubulin gene 
family in Ceratopteris will ultimately help to answer questions specifically 
related to the origin and function of gene families in organisms with large 
genomes. At 11,294 Mb (Jo Ann Banks, personal communication), the genome 
of C. richardii is ca. 110 times the genome size of Arabidopsis thaliana, 24 
times that of rice, 12 times that of tomato, 4 times that of maize, and 2 times 
that of barley. Some relevant questions include the following. Do organisms 
with larger genomes have larger gene families? Do such organisms (with their 
apparent excess amounts of DNA) have more pseudogenes among the members 
of their gene families? How are the various members of gene families regulated 
during development in organisms with large genomes? The latter question is 
an especially interesting one as it relates to the ferns, whose alternation of 
generations features fully independent sporophyte and gametophyte 
generations. 
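
The fold differences quoted above can be inverted to show the comparison genome sizes they imply. The following Python sketch performs only that arithmetic. The 11,294 Mb value and the multipliers come from the text; the function and variable names are illustrative, and the printed sizes are back-calculations from rounded multipliers rather than independently sourced genome sizes.

```python
# Back-calculate the comparison genome sizes implied by the fold differences in the text.
C_RICHARDII_MB = 11_294  # from the text (Jo Ann Banks, personal communication)

# "ca. N times" multipliers as quoted in the text
FOLD_DIFFERENCE = {
    "Arabidopsis thaliana": 110,
    "rice": 24,
    "tomato": 12,
    "maize": 4,
    "barley": 2,
}


def implied_genome_size_mb(fold: float, reference_mb: float = C_RICHARDII_MB) -> float:
    """Genome size (Mb) implied for a species whose genome is `fold` times smaller."""
    if fold <= 0:
        raise ValueError("fold difference must be positive")
    return reference_mb / fold


implied = {species: implied_genome_size_mb(fold) for species, fold in FOLD_DIFFERENCE.items()}

for species, size_mb in implied.items():
    print(f"{species}: ~{size_mb:.0f} Mb implied")
```

Because the multipliers are approximate ("ca."), the implied sizes should be read as order-of-magnitude checks only.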

This study employed two distinct strategies for characterizing alpha-tubulin 
genes from C. richardii. The first approach involved identifying, isolating, and 
sequencing alpha-tubulin genes that were expressed in the gametophyte 
generation. This was accomplished by using an antibody-based screening 
method with a cDNA library derived from the gametophyte generation. 
Bayesian Inference analyses determined the phylogenetic relationships of 
these newly obtained sequences to other plant alpha-tubulin genes. The 
second approach utilized a mapping strategy developed for the recently 
completed project of Nakazato et al. (2006) to generate a high-resolution 
linkage map for C. richardii. This mapping project used genetic 
polymorphisms present in a population of nearly 500 Doubled Haploid Lines (DHLs) 
generated from an initial cross between diploid inbred lines of highly diverged 
geographic races of C. richardii (Hickok et al., 1995): ΦN8 (derived from 
a Nicaraguan collection) and Hα-PQ45, a mutant of Hn-n (derived from a Cuban 
collection). Together these two approaches provide detailed insights regarding 
a previously uncharacterized gene family in C. richardii. These new insights 
suggest novel strategies for studying the alpha-tubulin gene family at the level 
of development and gene regulation and also at the level of the genome. 


Materials and Methods 

cDNA library screening. —The cDNA library used in this study was a gift 
from Jo Ann Banks (Purdue University). It was made from 12-day-old cultures 
of C. richardii gametophytes of the Hn-n strain containing both males and 
hermaphrodites. The cDNA was cloned into the EcoRI site of the lambda 
ZipLox bacteriophage vector (Life Technologies). The aliquot used to screen 
for alpha-tubulin genes was from a sample that had been amplified from the 
original library. The library was screened using standard methods for detecting 
specific proteins via antibody labeling (Young and Davis, 1991). In brief, the 
bacteriophage vector was first grown on a lawn of bacteria and, after plaques 
were produced, gene expression was induced with IPTG. Proteins from the 
library were adhered to nylon membranes and detected by hybridization to an 
anti-alpha-tubulin monoclonal antibody (Sigma, catalogue number T5168). 
The presence of this antibody was detected with a labeled secondary antibody. 
After initial detection and isolation of plaques that tested positive for 
expression of alpha-tubulin, several rounds of re-screening were conducted 
to eliminate contaminating vectors that were not positive for alpha-tubulin. 
Bacteriophage vectors containing alpha-tubulin cDNAs were converted to 
plasmids via an in vivo excision process as described in technical reference 
materials supplied by Life Technologies. 

DNA sequencing. —Plasmids with the cDNA inserts were used as the 
sequencing templates. For each, at least two sequencing reactions 
were conducted using the two primer sites that flank the EcoRI site where the 
cDNA is inserted. When the results from two sequencing reactions indicated 
that the insert was longer than 1220 base pairs, two additional pairs of primers 
were constructed based on the new sequence data. These new internal 
sequencing primers provided more reliable results for the central regions of 
these longer inserts. The sequence reads from either two or four reactions were 
assembled using the software program ContigExpress (Vector NTI). 
Ambiguities that resulted from discrepancies between overlapping sequences were 
resolved by relying on the sequence(s) with the clearest chromatogram. 

BLAST comparisons. —Individual sequences were compared with other sequences 
contained in the GenBank database using the standard BLAST program for 
proteins (protein-protein BLAST) provided on the NCBI web site, as described 
in the Results section. The Composition-based statistic option and the filter 
options were not enabled so that the results of the BLAST comparisons would 
be based on similarities and/or differences of individual amino acids. Since 
the GenBank database is constantly being updated, it should be noted that the 
reported BLAST comparisons were last confirmed on July 29, 2005. 

Phylogenetic analysis of alpha-tubulin gene relationships. —A Bayesian 
Inference analysis of nucleotide sequences was used to infer the phylogenetic 
relationships of the four C. richardii alpha-tubulin genes with regard to other 
plant alpha-tubulin gene nucleotide sequences available in GenBank for 
Viridiplantae. Sequences used were from a broad selection of plants, including 
those corresponding to the results of the amino acid BLAST search above, 
except that nucleotide sequences precisely corresponding to the amino acid 
sequences of Gossypium hirsutum Q6VAG0 and Chlamydomonas reinhardtii 
P09204 could not be located in GenBank. Prior to analysis, nucleotide 
sequences were aligned visually. The first three nucleotides, which were 
invariable, and the last 45 nucleotides, which caused ambiguous alignments, 
were removed. MrModeltest 2.2, a modified version of Modeltest 3.6 (Posada 
and Crandall, 1998), determined that the suitable nucleotide substitution 
model for the analyzed sequences is GTR+I+G. Bayesian analysis was 
conducted using MrBayes 3.1 (Huelsenbeck and Ronquist, 2001; Ronquist 
and Huelsenbeck, 2003) with three million generations and a sample 
frequency of 1000. The first 300 "burn-in" trees were discarded after the 
analysis. Five independent runs using the same setting converged to identical 
trees, except that one node was resolved in only one run. Two beta-tubulin 
genes were used as an outgroup to root the resulting tree. The species used and 
their GenBank accession numbers are identified in the Results section, as are 
the Bayesian posterior probability confidence values for the tree's clades. 
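
The run settings quoted above fix how many trees were sampled and what share of them the burn-in represents. The Python sketch below works through that arithmetic. The three input values come from the text; the function name is illustrative, and the count ignores any tree that the software may also save at generation 0, which is an assumption made here for simplicity.

```python
# Sampling arithmetic for the Bayesian run described in the text.
GENERATIONS = 3_000_000    # from the text
SAMPLE_FREQUENCY = 1_000   # from the text: one tree saved every 1000 generations
BURN_IN_TREES = 300        # from the text


def sampled_trees(generations: int, sample_frequency: int) -> int:
    """Trees saved during the run, not counting any tree saved at generation 0."""
    if generations < 0 or sample_frequency <= 0:
        raise ValueError("generations must be >= 0 and sample_frequency > 0")
    return generations // sample_frequency


total_trees = sampled_trees(GENERATIONS, SAMPLE_FREQUENCY)
retained_trees = total_trees - BURN_IN_TREES
burn_in_fraction = BURN_IN_TREES / total_trees

print(f"sampled: {total_trees}, retained: {retained_trees}, burn-in: {burn_in_fraction:.0%}")
```

Under these assumptions the burn-in removes one tenth of the sampled trees, leaving 2700 for estimating posterior probabilities.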

Linkage mapping of the alpha-tubulin genes. —A genetic linkage map of the 
Ceratopteris richardii genome was developed independently of this 
alpha-tubulin study (Nakazato et al., 2006). The mapping population of ~500 
Doubled Haploid Lines (DHLs) was generated by intragametophytic selfing of 
gametophytes derived from spores of an initial cross between diploid inbred 
lines of highly diverged geographic races of C. richardii (Hickok et al., 1995): 
ΦN8 (derived from a Nicaraguan collection, Nichols 1719, GH) and Hα-PQ45, 
a mutant of Hn-n (derived from a Cuban collection, Killip 44595, GH). The map 
is based on analysis of Restriction Fragment Length Polymorphism (RFLP), 
Amplified Fragment Length Polymorphism (AFLP), and allozyme markers. 
RFLPs used in the general mapping project were generated from genomic DNA 
of parental and DHL sporophytes digested with EcoRI and HindIII, separated 
on 0.8% agarose gels in 1X TAE, and Southern blotted to nylon membranes. 
Further details of the methods used in development of mapping materials and 
map construction are in Nakazato and Gastony (2006) and Nakazato et al. 
(2006). Alpha-tubulin gene copies were located on the linkage map in the 
following way. RFLPs containing the alpha-tubulin genes were detected by 
probing the Southern blots with an alpha-tubulin probe made chemiluminescent 
by digoxigenin (DIG)-labeling according to a protocol optimized for the C. 
richardii genome mapping project. The alpha-tubulin probe used is C. 
richardii cDNA clone Cri 10_E18_SP6 (GenBank sequence number 
BQ086953), from the cDNA library of C. richardii gametophytic tissue 
provided by Jo Ann Banks (Purdue University). This clone sequence 
corresponds to GenBank sequence AY231146 of TuaCR1 in Table 1, according 
to an NCBI BLAST search using blastn, which found sequence identity at 
651/652 (99.85%) of the bases compared. Probing of the mapping project's 
Southern blots with the alpha-tubulin probe was carried out toward the end 
of the mapping project when the Southern blots were beginning to wear out. 
Thus the quality of the autorads presented here is suboptimal but 
nevertheless scorable. Parental alpha-tubulin RFLPs segregating in the DHL mapping 
population were scored and placed on the linkage maps by MAPMAKER/EXP 
3.0 (Lander et al., 1987) at a UNIX workstation at Indiana University, 
Department of Biology, at settings used for the general mapping project. 


Results 

Sequencing results. —A cursory comparison of the amino acid sequences for 
the four Ceratopteris alpha-tubulin genes (Fig. 1) suggests immediately that 
they are relatively similar to one another. Although only one of the four 
sequences is complete, it is possible to note at least two trends directly from 
the comparison presented (Fig. 1). First, for the portion where sequences are 
available for all four genes (i.e., residues 195 to 332 of TuaCR1), TuaCR3 and 
TuaCR4 are the most distinct. In this region, TuaCR3 contains five unique 
amino acid residues and TuaCR4 contains four, while the other two sequences 
each possess only a single unique amino acid. The second observation that can 
be made directly from these data relates only to the three sequences that 
include complete carboxy-terminal ends (i.e., TuaCR1, 2, and 3). In this region, 
these three sequences are highly diverse, with TuaCR3 once again exhibiting 
the greatest number of unique residues. 

Table 1. A comparison of the four Ceratopteris alpha-tubulin sequences identified in this study to 
other alpha-tubulin sequences recorded in GenBank. Columns: names for genes; GenBank accession; 
size of DNA (a); ORF; accession number, description, and similarity of the most similar GenBank 
sequence(s) (b, c). Only the entries legible in the source are listed, in their original order. 

  1) AAK81858.1, alpha tubulin subunit 
  AY862566; 1,016; 267 
  AY862587; 1,018 
  AY862569; 1,016 
  Pseudotsuga menziesii, 260/267 (97.4%) 
  AY862564; 1,044; 257; 1) BAA03955, alpha-tubulin 1, Chlorella vulgaris, 245/257 (95.3%); 
      2) P09204, Tubulin alpha-1 chain, Chlamydomonas reinhardtii, 245/257 (95.3%) 
  TuaCR4; AY862568; 975; 325; 1) AAV92379.1, alpha tubulin 1, Pseudotsuga menziesii, 
      316/325 (97.2%) 

(a) This includes reliable DNA sequences before and after the open reading frame if present, but not 
the predicted polyA tail. 
(b) If two or more versions of the same gene occurred in GenBank, only the most recent one is listed 
here. If several identical sequences were submitted by the same author(s) and are listed with the 
same submission date, only the one with the last accession number in the numerical sequence is 
listed here. The descriptions are given in the form in which they were submitted to GenBank. 
(c) Sequence similarities were determined using the standard BLAST program for protein sequences 
to search the GenBank database. The Composition-based statistic option and the filter options were 
not enabled. The BLAST searches were performed on July 29, 2005. 


An additional sequence identified as an alpha-tubulin sequence from 
Ceratopteris has been deposited in GenBank by a separate research group 
(Salmi et al., 2005). This 825 base-long sequence (GenBank accession 
BE642799) is a single pass sequence from a collection of expressed sequence 
tags. The bl2seq tool on the NCBI web site was used to compare this sequence 
with the four alpha-tubulin sequences described here (data not shown). A large 
region of this sequence (ranging from 574 to 640 bases) is 77 or 78% identical 
to the beginning portions of TuaCR1 and TuaCR4, respectively. It has no 
significant similarity to TuaCR3. The highest identity (88% for a region that is 
195 bases long) occurs between this sequence and TuaCR2. This region of 
similarity occurs at the beginning of TuaCR2 and at the end of the sequence 
described by Salmi et al. (2005). Because this sequence was derived from 
a single sequencing experiment, it may be expected to contain a relatively large 
number of incorrect bases. Furthermore, the authors did not provide a deduced 
protein sequence for this gene. For these reasons, and because it may represent 
an otherwise uncharacterized portion of one of the genes described here, it was 
not included in any of the comparisons described below. 

SCOTT ET AL.: ALPHA-TUBULIN GENE FAMILY IN CERATOPTERIS RICHARDII 

TuaCR1  MRECISIHIGQAGIQVGNACWELYCLEHGIQPDGQMPSDKTVGGGDDAFNT 
TuaCR1  FFSETGAGKHVPRAIFVDLEPTVIDEVRTGTYRQLFHPEQLISGKEDAANN 
TuaCR1  FARGHYTIGKEIVDLCLDRIRKLADNCTGLQGFLVFNAVGGGTGSGLGSLL 
TuaCR1  LERLSVDYGKKSKLGFTVYPSPQVSTSWEPYNSVLSTHSLLEHTDVAVLL 
TuaCR1  DNEAIYDICRRSLDIDRPTYTNLNRLVSQVISSLTASLRFDGALNVDVTEF 
TuaCR1  QTNLVPYPRIHFMLSSYAPVISAEKAYHEQLSVAEITNSAFEPSSMMAKCD 
TuaCR1  PRHGKYMACCLMYRGDWPKDVNAAVATIKTKRTIQFVDWCPTGFKCGINY 
TuaCR1  QPPTWPGGDLAKVQRAICMISNSTSVAEVFSRIDFKFDLMYCKRAFVHWY 
TuaCR1  VGEGMEEGEFSEAREDLAALEKDYEEVGAEGQDDDEPGD DEY 

[The aligned rows for TuaCR2, TuaCR3, and TuaCR4 are not legible in the source.] 

Fig. 1. Alignment of the predicted amino acid sequences of four alpha-tubulin proteins from 
Ceratopteris richardii; the standard symbols for amino acids are used. Only the sequence for 
TuaCR1 is complete, and all other proteins are compared to it. Dashes represent unknown amino 
acids, asterisks represent known amino acids that are identical to those shown for TuaCR1, bold 
letters represent known amino acids that differ between TuaCR1 and at least one other TuaCR 
copy, and a blank space represents an apparent gap in the alignment. 

A BLAST comparison of the four protein sequences for the Ceratopteris 
genes with other alpha-tubulin sequences recorded in GenBank reveals 
a somewhat diverse pattern (see the last column of Table 1). First, each of 
the four sequences was more similar to an alpha-tubulin from a different plant 
species than it was to any other Ceratopteris alpha-tubulin sequence. Second, 
the sequences from other plants that are most similar to the Ceratopteris 
sequences are quite diverse. The first two Ceratopteris amino acid sequences 
(those of TuaCR1 and TuaCR2) are most similar to sequences derived from 
dicotyledonous plants and/or a gymnosperm, the third is most similar to two 
different algal sequences, and the fourth is most similar to a sequence from 
a gymnosperm. These observations are perhaps more noteworthy in light of the 
fact that two distinct sequences for alpha-tubulin genes from the fern Anemia 
phyllitidis (one is a partial sequence) and two from the moss Physcomitrella 
patens are deposited in GenBank. When compared to the Ceratopteris genes 
(data not shown), some of the four lower plant genes have nearly as many 
identical amino acids as do the sequences from the higher plants and 
algae shown in Table 1. However, when these high levels of identity exist, the 
lower plant sequences also have at least one additional amino acid or at least 
one fewer amino acid than the Ceratopteris sequences, resulting in gaps in the 
sequence homologies. The homologies indicated in Table 1 do not have such 
gaps and therefore the sequences they refer to are considered to be more 
similar to the Ceratopteris sequences. 

For comparative purposes, BLAST searches were conducted for each of the 
known alpha-tubulin genes of Arabidopsis to determine what sequences from 
Arabidopsis or other organisms would show the highest similarities to each of 
these genes (data not shown). The entire genome of Arabidopsis has been 
sequenced, and six separate alpha-tubulin genes (TUA1-TUA6) have been 
located in its genome. The patterns of similarity for these genes are 
dramatically different from those of Ceratopteris. In general, the six 
Arabidopsis alpha-tubulin genes show much more similarity to one another 
than do the four Ceratopteris genes. Among these six genes are two pairs 
whose protein sequences are identical. The proteins of TUA2 and TUA4 are 
identical, and those of TUA3 and TUA5 are identical. Furthermore, TUA6 is 
almost identical to TUA2/4, sharing 448 out of 450 amino acids (99.6% 
identity). The amino acid sequences that are the next most similar to these five 
Arabidopsis genes (after comparing them to other Arabidopsis genes) are all 
sequences from angiosperms. The sequence of TUA2/4 is 98.7% identical to 
a sequence from Brassica napus, the sequence of TUA6 is 99.1% identical to 
the same sequence from B. napus, and the sequence of TUA3/5 is 97.1% 
identical to a sequence from Oryza sativa. Only the sequence of the remaining 
Arabidopsis gene product, that of TUA1, is more similar to the sequence of 
a different plant than it is to another Arabidopsis gene. The TUA1 protein 
shares 414 amino acids out of 450 (92% identity) with the grass Miscanthus. 
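
The identity percentages reported in the text and in Table 1 follow directly from the counts of identical and compared residues. The Python sketch below recomputes them as a consistency check. The counts are taken from the text; the function name and the rounding convention (one decimal place unless stated) are assumptions made here.

```python
def percent_identity(identical: int, compared: int, digits: int = 1) -> float:
    """Percent identity from the number of identical and the number of compared positions."""
    if compared <= 0 or not 0 <= identical <= compared:
        raise ValueError("need 0 <= identical <= compared and compared > 0")
    return round(100.0 * identical / compared, digits)


# (identical, compared) pairs quoted in the text and in Table 1
reported_counts = {
    "TUA6 vs TUA2/4": (448, 450),
    "TUA1 vs Miscanthus": (414, 450),
    "Table 1, Pseudotsuga menziesii (267-residue ORF)": (260, 267),
    "Table 1, Chlorella vulgaris": (245, 257),
    "TuaCR4 vs Pseudotsuga menziesii": (316, 325),
}

for label, (identical, compared) in reported_counts.items():
    print(f"{label}: {percent_identity(identical, compared)}%")

# The probe comparison in Materials and Methods is quoted to two decimal places.
print("probe vs AY231146:", percent_identity(651, 652, digits=2), "%")
```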

Carboxy termini of the four protein sequences of 
alpha-tubulins from Arabidopsis thaliana: 

TUA1    AREDLAALEKDYEEVGGEGAEDDDEEGDEY 
TUA2/4  AREDLAALEKDYEEVGAEGGDDEDDEGEEY 
TUA3/5  AREDLAALEKDYEEVGAEGGDDEEDEGEDY 
TUA6    AREDLAALEKDYEEVGAEGGDDEDDEGEEY 


Carboxy termini of three inferred protein sequences 
of alpha-tubulins from Ceratopteris richardii: 


TuaCR1  AREDLAALEKDYEEVGAEGQDDDEPGD DEY 
TuaCR2  AREDLAALEKDYEEVGAEGDEGDEGEDGDEY 
TuaCR3  AREDLAALEKDFEEVGADSTEGDGEDEGEEY 


Fig. 2. Comparison of the carboxy terminus regions of Arabidopsis and Ceratopteris alpha-tubulin 
genes. Bold letters represent regions where the sequences within each set are not identical; a blank 
space represents an apparent gap in the alignment. 


To further compare the relative diversity of the Ceratopteris and Arabidopsis 
alpha-tubulin proteins, the sequences from the carboxy terminus region, 
which is considered to be highly variable in alpha-tubulin proteins (e.g., see 
Mot...), were compared (Fig. 2). Since only three of the four Ceratopteris 
sequences contain the coding region for the carboxy terminus portion of the 
protein, TuaCR4 was omitted from this analysis. For both species the amino 
acid sequence is relatively conserved until the last 14 residues at the carboxy 
terminus end, or in the case of TuaCR1, which appears to have a single amino 
acid deletion in this region, the last 13 amino acids. In this region the 
Arabidopsis sequences share 6 identical amino acids out of 14, while the 
Ceratopteris sequences share only 3 identical amino acids. 
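
The counts of shared residues can be reproduced from the carboxy-terminal sequences shown in Fig. 2. The Python sketch below counts alignment columns, within the last 14 aligned positions, in which every sequence of a set carries the same residue. The sequences are copied from Fig. 2, with the blank in TuaCR1 written as "-"; the function name and the treatment of a gap as a non-identical column are assumptions made here.

```python
# Aligned carboxy termini as shown in Fig. 2 (the gap in TuaCR1 is written as "-").
ARABIDOPSIS = {
    "TUA1":   "AREDLAALEKDYEEVGGEGAEDDDEEGDEY",
    "TUA2/4": "AREDLAALEKDYEEVGAEGGDDEDDEGEEY",
    "TUA3/5": "AREDLAALEKDYEEVGAEGGDDEEDEGEDY",
    "TUA6":   "AREDLAALEKDYEEVGAEGGDDEDDEGEEY",
}
CERATOPTERIS = {
    "TuaCR1": "AREDLAALEKDYEEVGAEGQDDDEPGD-DEY",
    "TuaCR2": "AREDLAALEKDYEEVGAEGDEGDEGEDGDEY",
    "TuaCR3": "AREDLAALEKDFEEVGADSTEGDGEDEGEEY",
}


def conserved_columns(aligned: dict, window: int = 14, gap: str = "-") -> int:
    """Count columns in the last `window` aligned positions where all residues are identical.

    A column containing a gap is counted as not identical.
    """
    sequences = list(aligned.values())
    if len({len(seq) for seq in sequences}) != 1:
        raise ValueError("all sequences in a set must be aligned to the same length")
    tails = [seq[-window:] for seq in sequences]
    return sum(
        1
        for column in zip(*tails)
        if gap not in column and len(set(column)) == 1
    )


arabidopsis_shared = conserved_columns(ARABIDOPSIS)
ceratopteris_shared = conserved_columns(CERATOPTERIS)
print(f"Arabidopsis: {arabidopsis_shared} of 14; Ceratopteris: {ceratopteris_shared} of 14")
```

With these inputs the counts agree with the values of 6 and 3 stated in the text.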

Gene phylogeny results. —The Bayesian tree (Fig. 3) infers phylogenetic 
relationships of nucleotide sequences of the four sequenced C. richardii 
alpha-tubulin genes in relation to alpha-tubulin genes from other plant species. This 
figure depicts the tree from the single run noted in Materials and Methods as 
resolving one of the nodes, the resolved subnodes being toward the top of the 
tree with probabilities of 0.80, 0.50, and 0.81. The central clade from Zea mays 
22149 to Oryza sativa japonica cultivar-group 1136121 was not resolved in any 
of the Bayesian runs. Posterior probability values show that many clades of 
this gene tree are strongly to moderately supported, although some deeper 
branches had weak support. The unresolved and weakly supported clades, 
however, are inconsequential to this paper, which focuses instead on the four 
copies of C. richardii genes and their placement in strongly supported clades. 
All nodes separating the four Ceratopteris genes are strongly supported. Three 
Ceratopteris copies (TuaCR1, TuaCR2, and TuaCR4) toward the top of Fig. 3 
are relatively closely related to each other in a strongly supported subclade 
that is part of the major clade stretching from C. richardii TuaCR1 29423812 to 
Oryza sativa japonica cultivar-group 1136121. This major clade is separated 
from its sister clade of miscellaneous algae and C. richardii TuaCR3 57867856 
with a posterior probability of 1.00, although the precise positioning of C. 
richardii TuaCR3 in the algal clade is less strongly supported. 

Fig. 4. Example autoradiograms of total genomic DNAs from the two parental C. richardii races 
and 10 DHLs derived from their cross, all probed with a DIG-labeled alpha-tubulin Cri 10_E18_SP6 
cDNA clone after respective digestions of the genomic DNAs with HindIII or EcoRI. Arrowheads in 
DHL lanes 2 and 10 indicate at least 13 and 17 probable bands, respectively. Segregating 
co-dominant and dominant RFLP bands containing alpha-tubulin genes are identified by arrows to the 
left of each autoradiogram and are discussed in the text and mapped in Fig. 5. 

Mapping results. —Probings of parental and DHL DNAs with the DIG-labeled 
alpha-tubulin Cri 10_E18_SP6 cDNA clone yielded at least ca. 13-17 
restriction fragment bands per DHL. The total number of bands cannot be 
determined with precision because of overlap and faintness of some of the 
bands. As an example, Fig. 4 shows the probing results for parental 
sporophytes Hα-PQ45 and ΦN8 and for ten DHLs whose genomic DNAs were 
cut with restriction enzymes HindIII and EcoRI, respectively. The DHL in lane 
2 of the HindIII digest, for example, probably exhibits at least 13 bands 
(arrowheads) containing sequence to which the alpha-tubulin probe anneals, 
and the DHL in lane 10 shows at least 17 bands. In some cases multiple bands 
may represent a single alpha-tubulin gene sequence that has been cut by the 
HindIII or EcoRI restriction enzyme. Although the sequence of the cDNA probe 
used in this alpha-tubulin study contains no HindIII or EcoRI restriction site, 
the coding region of the target DNA might contain such sites, and introns 
within the genes of the target DNA may contain HindIII or EcoRI sites. No data 
are available to address these possibilities. 

Mapping can be performed only for those parental alpha-tubulin gene copies 
contained in restriction fragments that are polymorphic between the two 
parents and that therefore segregate in the DHLs. Four such unequivocal 
segregating sets of restriction fragments were observed in the genomic DNAs 
digested with HindIII and EcoRI in our mapping population. These are 
identified (Fig. 4) as H20046154_1 and H20046154_2 scored from the HindIII 
digest and E20046154_1 and E20046154_2 scored from the EcoRI digest. 
Polymorphic restriction fragment markers are usually expressed as co-dominant 
bands, meaning that the gene's presence is visualized in respective bands from 
both parents. In Fig. 4, this is exemplified in the HindIII digest by locus 
H20046154_1, where the larger fragment from parent Hα-PQ45 (also seen in 
DHL lanes 1, 2, 6, 10) is ca. 1 mm closer to the top of the figure than is the 
smaller fragment from parent ΦN8 (also seen in DHL lanes 2, 4, 5, 7, 8, 9). 
Locus E20046154_1 in the EcoRI digest in Fig. 4 shows a similar pattern. In 
this case the smaller fragment in parent Hα-PQ45 (also seen in DHL lanes 1, 2, 
6, 10) is ca. 1.5 mm farther from the top of the autorad than is the larger 
fragment from parent ΦN8 (also seen in DHL lanes 2, 4, 5, 7, 8, 9). The two 
bands from the HindIII digest co-segregate with the two bands from the EcoRI 
digest in all of the mapping population's DHLs, indicating that they either 
mark the same locus visualized in the two different digests or mark 
two loci so closely linked that they show no crossover distance between them 
(i.e., they map to the same location). Locus H20046154_2 in the HindIII digest, 
on the other hand, is visualized as a dominant band in parent ΦN8 (also seen 
in DHL lanes 2, 4, 5, 6, 7, 8), meaning that no alternative band expression is 
visualized in the Hα-PQ45 parent (and in DHL lanes 1, 2, 9, 10). The reason for 
this dominant pattern is presently unknown but may be because the 
H20046154_2 gene copy in the Hα-PQ45 parent has been lost or has been 
moved to a different part of the genome where it cannot be scored because it 
overlaps with a different band, etc. In the EcoRI digest, locus E20046154_2 is 
also expressed as a dominant marker, present in parent Hα-PQ45 (and in DHL 
lanes 1, 4, 5, 9, 10), but with an alternative band lacking in parent ΦN8 (and in 
DHL lanes 2, 3, 6, 7, 8). Clearly the bands marking H20046154_2 and 
E20046154_2 do not co-segregate in the DHLs of the two digests, indicating 
that they mark different loci. The mapping program places H20046154_2 on 
linkage group (LG) 28 and E20046154_2 on LG 24 (Fig. 5 at arrows), whereas 
H20046154_1 and E20046154_1 map to the same position on LG 17 (Fig. 5 
arrow). 
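
The co-segregation reasoning above can be sketched in a few lines. Because 
each DHL is homozygous, every marker score reflects one parental allele, and 
two markers co-segregate when no line carries alleles from different parents 
at the two markers. The scores below are hypothetical and are not read from 
Fig. 4; the Haldane mapping function is used only as an illustrative 
assumption, since the function applied by the mapping program is not stated 
here. 

```python
import math


def recombination_fraction(scores_a, scores_b):
    """Fraction of lines whose parental allele differs between two markers.

    Each score is a parental label (e.g. 'A' or 'B'); None marks a line
    that could not be scored and is skipped.
    """
    pairs = [(a, b) for a, b in zip(scores_a, scores_b)
             if a is not None and b is not None]
    if not pairs:
        raise ValueError("no lines scored at both markers")
    recombinants = sum(1 for a, b in pairs if a != b)
    return recombinants / len(pairs)


def haldane_cm(r):
    """Haldane map distance in centiMorgans (assumed mapping function)."""
    if r >= 0.5:
        return math.inf  # markers behave as unlinked
    return -50.0 * math.log(1.0 - 2.0 * r)


if __name__ == "__main__":
    # Hypothetical scores for ten DHLs: 'A' = allele of one parent,
    # 'B' = allele of the other parent.
    marker_1 = list("AABBBABBBA")
    marker_2 = list("AABBBABBBA")  # identical pattern: co-segregating
    marker_3 = list("ABBAAABBAA")  # different pattern
    print(recombination_fraction(marker_1, marker_2))
    print(recombination_fraction(marker_1, marker_3))
```

A recombination fraction of zero corresponds to markers that map to the same 
position, as for the HindIII and EcoRI bands on LG 17; with only ten lines such 
an estimate would be very coarse, which is why the full mapping population is 
needed. 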


Discussion 


Map positions of alpha-tubulin genes.—The genetic linkage mapping project 
for C. richardii, fully described by Nakazato et al. (2006), has identified 41 
linkage groups that partly correspond to the 39 chromosomes per haploid set 
in this species. Alpha-tubulin gene copy H20046154_2 is located on LG 28, 
H&E20046154_1 is on LG 17, and E20046154_2 is on LG 24 (Fig. 5). Also seen 
on these three linkage groups are a large number of AFLP markers, each 
identified by a lowercase “a” followed by the primer pair used to generate it, 
and additional RFLP markers whose code numbers are the GenBank gi 
numbers of the cDNA library clones used as probes, prefaced with an “H” or 
an “E” to indicate whether that locus was visualized on the HindIII or the 
EcoRI digest. On LG 17 is also found the isocitrate dehydrogenase (IDH) 
isozyme locus, which maps to virtually the same position as H9960276_2. In 
all cases, the cumulative distance in centiMorgans from the topmost marker is 
given to the left of each linkage group. 

Fig. 5. Three of the 41 linkage groups presently identified in the C. richardii genome, showing at 
arrows the positions of the four segregating RFLP bands that contain alpha-tubulin genes identified 
in Fig. 4. 

Because the alpha-tubulin clone used as a DIG-labeled probe here anneals to 
every alpha-tubulin gene copy with sufficient sequence identity, given the 
stringency conditions of our general mapping project, we cannot determine 
which of the genes in Table 1 are represented by respective markers 
H&E20046154_1, H20046154_2, and E20046154_2. The large number of bands 
(at least 13 to 17 in the indicated lanes in Fig. 4) suggests a large alpha-tubulin 
gene family in C. richardii, even if some genes are represented by more than 
one band. We were unable to map more than three or four alpha-tubulin loci 
because only four loci were contained in restriction fragments polymorphic in 
the two parents in HindIII and EcoRI digests. 

Ceratopteris alpha-tubulin gene phylogeny.—The phylogenetic relationships 
of the four C. richardii genes in this paper are illustrated in Fig. 3. 
Bayesian posterior probability values show that copies TuaCR1 and TuaCR2 at 
the top of the tree are very strongly grouped as sister to each other and that 
both are in turn robustly sister to a copy from the conifer Pseudotsuga 
menziesii. Also with very strong support, the preceding are separated from 
TuaCR4, which is sister to an alpha-tubulin gene copy from the fern Anemia 
phyllitidis. The most parsimonious hypothesis to explain these present data is 
that TuaCR1 and TuaCR2 result from a recent duplication, perhaps after the 
divergence of ferns and seed plants, and that the gene duplication leading to 
the TuaCR4 lineage and the TuaCR1+TuaCR2+Pseudotsuga lineage preceded 
the divergence of ferns and seed plants. Ceratopteris richardii TuaCR3 (Fig. 3), 
on the other hand, is grouped in a clade of algal alpha-tubulin gene copies 
below the center of the tree. Although the precise relationships within this 
clade of algal genes+TuaCR3 are not strongly supported, a 1.0 posterior 
probability value strongly separates this clade from the other three Ceratopteris 
genes in the major sister clade of miscellaneous seed plant alpha-tubulin gene 
copies above it. This indicates that TuaCR3 had already diverged from the 
common ancestor of the other three Ceratopteris alpha-tubulin genes by the 
time of the common ancestor of the ferns and algae, preceding the divergence 
of algae and land plants. The sampling of alpha-tubulin genes from plants in 
general, however, is currently very incomplete. The four C. richardii copies 
discussed here are simply those expressed in a cDNA library derived from 
12-day-old gametophytes. Surely they do not represent the full range of 
alpha-tubulin gene copies expressed in the life cycle of C. richardii, just as the two 
alpha-tubulin gene copies sequenced thus far from the fern Anemia phyllitidis 
in Fig. 3 must represent only a fraction of the copies from that species. Inferred 
timings of duplications will likely change when more alpha-tubulin gene 
sequences become available. It may be noteworthy that C. richardii and the 
algae have motile gametes whereas all of the seed plants in Fig. 3 (including 
Pseudotsuga) lack motile sperm. This suggests TuaCR3 as a candidate for the 
gene copy functioning in sperm motility microtubules. 

The occurrence of distantly related beta- and gamma-tubulin genes resulting 
from ancient duplications and the large number of alpha-tubulin gene copies 
seen in the C. richardii genomes in Fig. 4 indicate that there has been extensive 
and continuous duplication of tubulin genes throughout the evolution of 
organisms. It is also likely that deletion/silencing of tubulin gene copies is 
frequent, perhaps in lineage-specific ways. For example, despite intensive 
sequencing of alpha-tubulin genes in seed plants, gene copies from seed plants 
are absent from the lineage containing the TuaCR3 and algal gene copies in 
Fig. 3. This suggests that silencing/deletion of this copy may have occurred 
specifically in the seed plant lineage. 

Diversity of the C. richardii alpha-tubulin genes.—In terms of amino acid 
sequences, particularly the carboxy termini, the four Ceratopteris 
alpha-tubulin sequences appear to be relatively diverse both when compared to one 
another (Figs. 1 and 2) and when compared to sequences from other plants (see 
Table 1, Fig. 2, and text of the Results section). The structural similarity of the 
Ceratopteris alpha-tubulin protein sequences to those of other diverse species 
is not surprising. This kind of similarity is often seen for alpha-tubulin 
proteins when sequences are compared using the BLAST algorithm (data not 
shown). The phylogenetic relationships of these four C. richardii sequences 
determined by Bayesian analysis of their nucleotide sequences, excluding the 
invariant first codon and the non-alignable 45 carboxy-terminus nucleotides, 
however, indicate less diversity (Fig. 3). Phylogenetic analysis shows that 
TuaCR1 is closest to TuaCR2 and that, except for the alpha-tubulin copies from 
Pseudotsuga and Anemia, TuaCR4 is most closely related to TuaCR1 and 
TuaCR2. TuaCR3, on the other hand, is quite distant from the other three C. 
richardii copies and is most closely related to algal alpha-tubulin genes. 

Potential insights from further studies of the Ceratopteris alpha-tubulin 
gene family.—Two main types of insights may be gained by studying the 
diverse genes of organisms like Ceratopteris, and these are discussed in turn 
below. First, the diversity maintained in gene families like that of 
alpha-tubulin may provide new insights into the evolutionary history of specific 
genes as well as the mechanisms by which they evolve. The conventional 
wisdom regarding gene families is that the multiple versions of related genes 
develop by gene duplication and random mutation that produces variants 
subject to varying degrees of selection. In some cases, the different forms of 
these related genes assume different roles over the course of evolutionary 
history (e.g., see Zhang, 2003; Irish and Litt, 2005). 

Consider the comparison presented earlier between the alpha-tubulin genes 
of Arabidopsis and those of Ceratopteris. The relative similarity in amino acid 
sequence of the six alpha-tubulin genes of Arabidopsis thaliana noted above 
suggests at least two likely interpretations. First, selection pressure may be 
very high for alpha-tubulin genes in this species, such that very little deviation 
is tolerated. Second, there may be relatively fewer potential roles for 
alpha-tubulin in the development and life history of Arabidopsis (for example, sperm 
are non-motile, unlike the situation in ferns) and therefore evolutionary factors 
have resulted in less diversity. In this respect, it is noteworthy that the 
Arabidopsis alpha-tubulin gene that differs most from the other five, TUA1, 
appears to be expressed primarily in pollen grains (Carpenter et 
al., 1992), while the others are apparently expressed in various tissues 
throughout the plant (Kopczak et al., 1992). 

By comparison with the alpha-tubulin genes of Arabidopsis, amino acid 
sequences of the four known alpha-tubulin genes of Ceratopteris appear 
somewhat diverse, suggesting selective pressures favoring the origin or 
maintenance of alpha-tubulin variation in this fern. This may relate in part 
to its fully independent gametophyte generation with perhaps more potential 
roles for alpha-tubulin (for example, as a component of the flagella in 
swimming sperm) than are found in its angiosperm counterpart, Arabidopsis. 
If these considerations prove correct and generally applicable to other genes, 
new insights regarding evolutionary mechanisms could be gained from studies 
focusing on C. richardii as a model homosporous vascular plant that could 
never be gained by studying plants like Arabidopsis alone. 

A second type of information to be gained by studying diverse members of 
gene families, such as those of the C. richardii alpha-tubulin family, relates to 
gene expression. For example, what are the specific roles of the various 
members of the gene family? When during development and under which 
environmental conditions are these genes expressed? How is the expression of 
each gene controlled? To provide some insight into how C. richardii can be 
utilized to answer such questions, two potential research strategies utilizing 
the new sequence data presented in this report are described below. 

A large number of morphological mutants exist from classical mutagenesis 
and selection screens using Ceratopteris (reviewed in Hickok, 1987; Hickok et 
al., 1987, 1995). Some affect cell shape, like the dwarf or bubbles mutant 
(Hickok, personal communication), others affect intracellular organization, 
like the polka-dot mutant (Vaughn et al., 1990), while yet others, like sleepy 
sperm (Renzaglia et al., 2004), affect the mobility of the sperm. Based on the 
phenotypes of the above mutants, it is likely that the defect associated with 
some of them may involve either an alpha-tubulin gene itself, a gene that 
controls expression of an alpha-tubulin gene, or a gene for another protein that 
interacts with alpha-tubulin. With the four sequences presented in this report, 
it will be relatively simple to assess the first two possibilities, using PCR to 
obtain sequence data and techniques like reverse transcription-PCR (RT-PCR) 
and/or real-time PCR to study relative levels of gene expression. 

One may also study the roles of the various alpha-tubulin genes by utilizing 
recently developed methods for inducing gene silencing via RNA interference 
(RNAi) in spores and gametophytes of Ceratopteris (Stout et al., 2003; 
Rutherford et al., 2004). RNAi is a technique whereby specific genes may be 
silenced by the introduction of short single- or double-stranded RNA 
sequences into an organism's cells. These short RNA sequences, which must 
be highly similar to or identical to the complementary coding sequences 
within the targeted gene, trigger a cellular response that causes the degradation 
of the mRNA transcribed from the targeted gene. 

In the two studies published so far using RNAi in C. richardii, the short 
RNAs have been introduced either by simply soaking spores directly in 
a solution containing the RNAs (Stout et al., 2003) or by using the biolistic 
method to deliver the RNA into gametophytic cells (Rutherford et al., 2004). 
The effects of gene silencing in C. richardii appear to be transmissible to cells 
descended from the cell into which the inhibiting RNA was initially 
introduced. In experiments where the inhibitory RNA was introduced into 
cells of young gametophytes by biolistic delivery, an observable phenotype 
was sometimes even transmitted to developing sporophytes (Rutherford et al., 
2004). 


Each of the four alpha-tubulin genes described here has at least several 
portions of its sequence that are not shared by the other known members of this 
gene family (this is especially evident in the 3' regions of TuaCR1, 2, and 3). 
These sequences can be used to design short inhibitory RNAs that should be 
able to silence specifically the expression of their respective genes. 
Alternatively, it should be possible to silence two or more of these genes at 
the same time by using short RNA sequences that are shared by two or more 
genes. In either approach, such silencing can be confirmed by demonstrating 
an absence of (or reduction in) the targeted mRNA by using RT-PCR (Stout et 
al., 2003) or real-time PCR (Rutherford et al., 2004), and phenotypes resulting 
from the silencing may also be observed in some cases (Rutherford et al., 2004). 
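
As a rough illustration of how gene-specific and shared target regions might 
be located, the sketch below compares sequences by their exact k-mers 
(subsequences of length k). The function names, the default k = 21, and the toy 
sequences are all hypothetical; the approach considers only exact matches 
among the sequences supplied and is not a complete design procedure for 
inhibitory RNAs, which would also need to account for near-matches and for 
gene copies not yet sequenced. 

```python
def kmers(seq, k):
    """Set of all subsequences of length k in seq (upper-cased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def gene_specific_kmers(genes, target, k=21):
    """k-mers found in the target sequence but in no other supplied sequence."""
    others = set()
    for name, seq in genes.items():
        if name != target:
            others |= kmers(seq, k)
    return kmers(genes[target], k) - others


def shared_kmers(genes, names, k=21):
    """k-mers common to every sequence listed in names."""
    return set.intersection(*(kmers(genes[name], k) for name in names))


if __name__ == "__main__":
    # Hypothetical toy sequences, not the TuaCR sequences of this report;
    # they share a 5' stretch and differ toward the 3' end.
    toy = {
        "geneA": "ATGCGTGAATGCATTTCC",
        "geneB": "ATGCGTGAATGCATAGGA",
    }
    print(sorted(gene_specific_kmers(toy, "geneA", k=6)))
    print(sorted(shared_kmers(toy, ["geneA", "geneB"], k=6)))
```

With the real sequences, candidates unique to one gene would be expected to 
cluster in the divergent 3' regions noted above, while shared candidates would 
fall in the conserved coding regions. 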

Several potential phenotypes could be easily observable as a result of 
silencing alpha-tubulin expression in Ceratopteris. Some may include such 
developmental defects as the inhibition of spore germination, or the generation 
of abnormal rhizoids, prothallial cells, or specialized cells such as those of 
antheridia or archegonia. Cytoskeletal defects generated by this approach may 
also lead to observable phenotypes; some may be associated with cell size or 
shape, while others may be as striking as the phenotype of the polka-dot 
mutant. Other observable abnormalities may affect the function of 
spermatozoids, generating phenotypes similar to the sleepy sperm mutant. The obvious 
advantage to generating such phenotypes through RNAi is that in each case, 
the phenotype will be directly associated with the silencing of one or more 
specific alpha-tubulin genes. This in turn may lead to the assignment of 
specific roles for each member of this gene family. 

The linkage positions of the three mapped loci together with the specific 
sequencing data provided here should facilitate future cloning of genomic 
sequences containing alpha-tubulin genes in C. richardii. This will enable 
characterization of the organization and controlling elements of these genes, 
enhancing our understanding of the evolution, function, and regulation of the 
alpha-tubulin gene family in vascular plants in general. The research 
possibilities discussed here illustrate how robust molecular techniques can 
be applied to the C. richardii system, furthering its usefulness as a model 
system. 